-
-
Notifications
You must be signed in to change notification settings - Fork 224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUGFIX: Invalidate caches correctly after node move changes have been discarded #4291
BUGFIX: Invalidate caches correctly after node move changes have been discarded #4291
Conversation
47e4a35
to
184692d
Compare
@@ -146,6 +146,20 @@ public function registerNodeChange(NodeInterface $node, Workspace $targetWorkspa | |||
*/ | |||
protected function registerAllTagsToFlushForNodeInWorkspace(NodeInterface $node, Workspace $workspace): void | |||
{ | |||
// Ensure that we're dealing with the variant of the given node that actually | |||
// lives in the given workspace | |||
if ($node->getWorkspace()->getName() !== $workspace->getName()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@grebaldi do you think this could be a performance problem when working with a large number >1000 nodes?
Like publishing a whole workspace and for some reason creating a large number of contexts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question :)
Context creation itself doesn't strike me as very expensive, so I guess that would not be a problem. $workspaceContext->getNodeByIdentifier
however will be quite expensive given an n=1000 iteration. It cannot be avoided though (IIRC).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fyi i was curious when this Workspace $workspace
parameter will actually different.
And i followed a bit the path an it seems its only set when publishing (and then its likely the base workspace of course):
neos-development-collection/Neos.ContentRepository/Classes/Domain/Service/PublishingService.php
Line 267 in 595fcba
public function emitNodePublished(NodeInterface $node, Workspace $targetWorkspace = null) |
The ContextFactory::create
seems to be cached but i agree the getNodeByIdentifier
in 158 below might be more expensive.
... But then again isnt this cached? Yes. The firstLevelNodeCache
actually caches the nodes by identifier ... huh .. only identifier not workspace? So my thesis is that the node returned is actually the same as passed (as that one likely exists in the firstLevelNodeCache
.
The only thing to avoid this cache layer is to use the nodeDataRepository
directly.
Maybe we should even consider extending the test flushingANodeWithAnAdditionalTargetWorkspaceWillAlsoResolveThatWorkspace
to test this and be sure about it. ^^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah i see each Context
has its own firstLevelNodeCache
, that makes the caching of course workspace ware, which makes sense. And as we create the context just at that very second, the cache is completely empty so indeed we will query the content repository.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No wait please ignore the following stuff i think its none sense :D
I dont fully understand this yet, but if we only want to ensure that $node->getWorkspace()->getName() === $workspace->getName()
we can cheat by building our own node:
$workspaceContext = $this->contextFactory->create(
array_merge(
$node->getContext()->getProperties(),
['workspaceName' => $workspace->getName()]
)
);
$correctNode = $this->nodeFactory->createFromNodeData($node->getNodeData(), $workspaceContext);
but in case that is not our sole goal but we want to really ask the database, so the if ($node === null) return;
will be used, that cheating will of course not work.
184692d
to
c99cd8b
Compare
Rebased on 8.3 |
…collection#4291 This needs to be undone before merge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So expensive but I also don't see another way to do this, thanks!
…cache-invalidation-on-discard
* @return void | ||
* @Flow\Signal | ||
* @api | ||
*/ | ||
public function emitNodeDiscarded(NodeInterface $node) | ||
public function emitNodeDiscarded(NodeInterface $node, ?Workspace $baseWorkspace = null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fyi due to this this change needs a proxy cache flush as the Neos Neos PublishingService
must be rebuild.
Declaration of Neos\Neos\Service\PublishingService::emitNodeDiscarded(Neos\ContentRepository\Domain\Model\NodeInterface $node) must be compatible with Neos\ContentRepository\Domain\Service\PublishingService::emitNodeDiscarded(Neos\ContentRepository\Domain\Model\NodeInterface $node, ?Neos\ContentRepository\Domain\Model\Workspace $baseWorkspace = null)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About how this depends on the Neos Ui part i wrote a conclusion here neos/neos-ui#3503 (comment) - so its not a problem if a project updates only the ui or neos and not both though to fix all bugs both should be updated for sure.
Tested in a big project as patch and could not find hint about a performance impact.
Copy document into other (932 changes)
Neos 8.3.8
copy and discard
| 25s | 2.1 |
| 13s | 2.0 |
| 19s | 2.1 |
copy and publish
| 23.9s | 7.6 |
| 20.5s | 7.5 |
delete copy and publish
| 9.9s | 9.7 |
| 8.9s | 7.8 |
Neos 8.3.8 + #4291
copy and discard
| 13.4 | 2.2 |
| 23.7 | 2.2 |
| 21.7 | 2.2 |
copy and publish
| 17.5 | 6.7 |
| 23.5 | 8.5 |
delete copy and publish
| 8.3 | 9.0 |
| 10.6 | 9.9 |
to move two document with children to other place - 1043 change - and discard
Neos 8.3.8
| 3.0s | 31.0 |
Neos 8.3.8 + #4291
| 5.3s | 30.5 |
| 2.3 | 15 |
| 3.5 | 28.8 |
NOTE: This PR needs to be tested in combination with its companion PR over at
neos-ui
: neos/neos-ui#3503The overall problem
This one's a bit complex and it stretches over two repositories. The basic scenario is:
neos/neos-ui#3184 describes the issues that happen only partially. During my investigation I found several problems that I need to disect piece by piece.
(I) The UI does not (entirely) recognize that nodes have been moved to a position below a different parent node
The problem
Peek.2023-05-21.15-14.-.3184.what.happens.on.discard.mp4
In the video you can see that after the two document nodes have been moved, the change is not made visible by the usual orange change indicator on the left hand side of the tree nodes. Also, on the attempt to discard the current set of changes, an error occurs that reads:
Both phenomena are related, because - as it turns out - the
UpdateWorkspaceInfo
feedback object, that is supposed to inform the UI about pending changes, delivers the wrong node context paths for the nodes that have just been moved (the context paths are still the old ones).Because the context paths are now out of sync, the UI is unable to associate the pending change with the respective tree node. It then also uses the stale workspace information as payload to the discard command, which leads to the above error at the following line:
https://github.com/neos/neos-ui/blob/5c52e08b8a1effc985911390a7124579c4018c25/Classes/Controller/BackendServiceController.php#L272
How come the context paths are incorrect after the nodes have been moved?
After some investigation I found that theNeos\ContentRepository\Domain\Factory\NodeFactory
class memoizes stale data - as opposed to e.g.ContentContext
which gets its in-memory cache flushed when nodes were moved:neos-development-collection/Neos.ContentRepository/Classes/Domain/Factory/NodeFactory.php
Line 81 in 0e28ef2
When thePublishingService
is asked for unpublished nodes viagetUnpublishedNodes
, it receives cache hits withinNodeFactory->createFromNodeData
for the nodes that have just been moved - thus the old context paths.EDIT: Not at all true 😅! I investigated further after the functional tests over here failed (hooray for functional tests!). The actual reason is as follows:
The Neos UI API uses a slightly different content context configuration than
getUnpublishedNodes
. So,NodeFactory
actually keeps track of two variants of the moved nodes. The ones thatgetUnpublishedNodes
receives are not the ones that the move operation has been performed on.The solution
The solution for this is implemented over here: neos/neos-ui#3503
(II) Discarded move changes are not properly reflected in the document tree
The problem
Problem (I) can be circumvented by hard-reloading the UI (after that, the workspace info will be correct again). But, there's still some strangeness going on...
Peek.2023-05-21.15-15.-.3184.what.happens.on.discard.after.reload.mp4
In the video you can see that the tree actually reflects the discarded changes correctly for a brief moment there. It then quickly jumps to a broken state in which the nodes that should be at their original positions just disappear.
This problem persists even if you hard-reload the UI after discarding:
Peek.2023-05-23.17-37.-.3184.what.happens.on.hard.reload.mp4
Now, if you use the reload button of the document tree to manually reload the tree, the nodes reappear:
Peek.2023-05-23.17-38.-.3184.what.happens.on.tree.reload.mp4
(But also: If you hard-reload the UI again, the nodes will once again flash briefly and then disappear)
How does this happen?
In those videos, the parent document that originally contained the moved nodes is focused and open in the guest frame. After discard, it should contain those nodes again. The UI reloads the guest frame, but the document is now rendered with stale node metadata. After the guest frame finished loading, the stale metadata (which still thinks the nodes have been moved elsewhere) overwrites the node data in the UI redux store.
This is why the correct state shows up for a brief moment. It gets overwritten after a short delay when the guest frame is loaded. (Also: The nodes do not disappear if you focus a different document and hard-reload the UI).
Looking at the cache configuration for the node metadata:
https://github.com/neos/neos-ui/blob/5c52e08b8a1effc985911390a7124579c4018c25/Resources/Private/Fusion/Prototypes/Page.fusion#L26-L46
... one should actually assume that the data shouldn't be stale (
Neos.Caching.descendantOfTag(documentNode)
should do the trick). It turns out though, that theNeos\Neos\Fusion\Cache\ContentCacheFlusher
class - in case of discard - will only flush tags related to a node's current workspace. We actually need to have all tags flushed in the base workspace as well to cover theDescendantOf_*
-tag of the original parent node.The solution
I'm not entirely sure about this. I did two things:
nodeDiscarded
signal so that it carries the base workspace of each discarded nodeThe reason behind this is that the
ContentCacheFlusher
listens to both thenodeDiscarded
and thenodePublished
signal with its methodregisterNodeChange
.nodePublished
carries the target workspace of the publishing operation, which is accepted as a second argument byregisterNodeChange
.nodeDiscarded
used to not carry such "target workspace" information, so that theContentCacheFlusher
had no chance to actually flush tags for this workspace.It feels to me like a bit of a stretch on semantics to just add the base workspace to the
nodeDiscarded
signal in such fashion, but I found it to be the least invasive yet most effective solution.registerAllTagsToFlushForNodeInWorkspace
method of theContentCacheFlusher
actually deals with the given node in the given workspace.registerAllTagsToFlushForNodeInWorkspace
takes a node and a workspace as arguments. It used to just assume that the given node is actually its variant in the given workspace, which seems to be working in a lot of cases, but in the case of discarded node move changes, the method makes false assumptions. It goes up the parent chain of the given node to find allDescendantOf_*
-tags that need to be flushed. The given node however has the wrong parent.I therefore added some code at the start of the method, that replaces the
$node
variable with the given node's variant in the given workspace, if the respective workspace names differ.Related Commit(s): f97268a, 184692d
(III) Discarding a node move while having a moved document node open in the guest frame results in an error page
The problem
Peek.2023-05-22.14-30.-.3184.Discard.Page.Move.While.On.Page.mp4
The video shows that when you're currently editing a moved document and then discard the move change, the guest frame reloads and shows a misleading fusion error. This is because the guest frame tries to render a document node that doesn't exist anymore.
A similar situation would be a document that has just been created. If you stay on that document and then discard it, the UI behaves correctly and redirects you to the next-higher document:
Peek.2023-05-23.17-46.-.3184.discard.new.node.mp4
Here, the problem lies within the
discardAction
of theNeos\Neos\Ui\Controller\BackendServiceController
which does not recognize discarded move changes and thus misses to inform the UI that it needs to remove the nodes at their former positions and re-insert them at their original positions.The solution
The solution for this is implemented over here: neos/neos-ui#3503
All solutions combined
Here's what it looks like when the PRs in
neos-ui
andneos-development-collection
are combined:Peek.2023-05-23.17-48.-.3184.all.changes.combined.mp4